Learning to Mine Query Subtopics from Query Log
نویسندگان
چکیده
Many queries in web search are ambiguous or multifaceted. Identifying the major senses or facets of queries is very important for web search. In this paper, we represent the major senses or facets of queries as subtopics and refer to indentifying senses or facets of queries as query subtopic mining, where query subtopic are represented as a number of clusters of queries. Then the challenges of query subtopic mining are how to measure the similarity between queries and group them semantically. This paper proposes an approach for mining subtopics from query log, which jointly learns a similarity measure and groups queries by explicitly modeling the structure among them. Compared with previous approaches using manually defined similarity measures, our approach produces more desirable query subtopics by learning a similarity measure. Experimental results on real queries collected from a search engine log confirm the effectiveness of the proposed approach in mining query subtopics.
منابع مشابه
Qualifier Mining for NTCIR-INTENT
We participated the Subtopic Mining subtask of NTCIR-9 INTENT task. Query Log is used as the primary resource to mine latent subtopics. Through analysis of query log, we observed that queries describing similar information needs will use a similar group of qualifiers, which may also frequently occur together within queries. We introduced the concept of qualifier graph for subtopic mining. To so...
متن کاملDiscovering Popular Clicks\' Pattern of Teen Users for Query Recommendation
Search engines are still the most important gates for information search in internet. In this regard, providing the best response in the shortest time possible to the user's request is still desired. Normally, search engines are designed for adults and few policies have been employed considering teen users. Teen users are more biased in clicking the results list than are adult users. This leads...
متن کاملQuery Subtopic Mining via Subtractive Initialization of Non-negative Sparse Latent Semantic Analysis
Ambiguous and multifaceted queries widely exist in academic and commercial search engines. Identifying the popular subtopics of queries is an important issue for search engines. In this paper, we propose a novel method to discover the popular subtopics for a given query. Our method first constructs a search behavior tripartite graph based on the search log data. Then, we utilize a subtractive i...
متن کاملTHUIR at NTCIR-9 INTENT Task
This is the first year IR group of Tsinghua University (THUIR) participates in NTCIR. We register the INTENT task and focus on the Chinese topics of subtopic mining and document ranking subtask. In our experiments, we try to mine subtopics from different resources, namely query recommendation, Wikipedia and the query-URL bipartite graph which is constructed by clickthrough data. We also develop...
متن کاملAnalysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)
Background and Aim: Information systems cannot be well designed or developed without a clear understanding of needs of users, manner of their information seeking and evaluating. This research has been designed to analyze the Ganj (Iranian research institute of science and technology database) users’ query refinement behaviors via log analysis. Methods: The method of this research is log anal...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015